feat(venice): add additional models to Venice.ai provider #112
base: main
- Add Llama 3.1 models (405B, 70B, 8B)
- Add Deepseek Coder V2 for coding tasks
- Add Qwen 32B and 72B models
- Add Mistral Nemo
- Add Hermes 3 405B

Expands Venice.ai model selection from 5 to 13 models, providing more options for different use cases including coding, reasoning, and cost-effective inference.
Pull request overview
This PR expands the Venice.ai provider model selection from 5 to 13 models, offering users more options for different use cases including coding, reasoning, and cost-effective inference.
- Adds 8 new models: Llama 3.1 variants (405B, 70B, 8B), Deepseek Coder V2, Qwen 32B and 72B, Mistral Nemo, and Hermes 3 405B
- Updates default_small_model_id from mistral-31-24b to llama-3.2-3b
- Adjusts Llama 3.2 3B pricing to be more cost-effective
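Because this change both adds models and repoints default_small_model_id, a quick sanity check on the provider config may be useful. The sketch below is only an illustration: the top-level "models" key and the helper name check_provider are assumptions, not taken from the PR; only "id" and "default_small_model_id" appear in the diff.

```python
def check_provider(provider: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    # "models" as the top-level key is an assumption about the config layout.
    ids = [m["id"] for m in provider.get("models", [])]

    # Flag any model id that appears more than once.
    duplicates = sorted({i for i in ids if ids.count(i) > 1})
    if duplicates:
        problems.append(f"duplicate model ids: {duplicates}")

    # The default small model must refer to a model that is actually listed.
    small = provider.get("default_small_model_id")
    if small not in ids:
        problems.append(f"default_small_model_id {small!r} is not in the model list")

    return problems


# Trimmed, illustrative config; the real file lists 13 models.
provider = {
    "default_small_model_id": "llama-3.2-3b",
    "models": [
        {"id": "llama-3.2-3b"},
        {"id": "llama-3.1-8b"},
    ],
}
print(check_provider(provider))
```

Running the same function against the full provider file would catch a default that points at a removed or misspelled model id.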
"cost_per_1m_in": 0.05,
"cost_per_1m_out": 0.05,
Copilot AI, Nov 24, 2025
The pricing for Llama 3.2 3B has been reduced from 0.15/0.6 to 0.05/0.05. This represents a 3x reduction in input costs and 12x reduction in output costs. Verify this significant pricing change is accurate with Venice.ai's current pricing, as such a substantial decrease could impact cost calculations for users.
Suggested change:
- "cost_per_1m_in": 0.05,
- "cost_per_1m_out": 0.05,
+ "cost_per_1m_in": 0.15,
+ "cost_per_1m_out": 0.6,
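The 3x and 12x figures in the comment above can be reproduced with a few lines of arithmetic. This is a minimal sketch using only the four prices quoted in the review; the helper name reduction_factor is made up for illustration.

```python
def reduction_factor(old: float, new: float) -> float:
    """How many times cheaper the new price is compared to the old one."""
    if new <= 0:
        raise ValueError("new price must be positive")
    return old / new


# Prices per 1M tokens for Llama 3.2 3B, as quoted in the review comment.
old = {"cost_per_1m_in": 0.15, "cost_per_1m_out": 0.6}
new = {"cost_per_1m_in": 0.05, "cost_per_1m_out": 0.05}

# Round to hide floating-point noise in the division.
factors = {key: round(reduction_factor(old[key], new[key]), 2) for key in old}
print(factors)  # {'cost_per_1m_in': 3.0, 'cost_per_1m_out': 12.0}
```

Whether 0.05/0.05 matches Venice.ai's published pricing still has to be checked against the provider; the code only confirms the ratios.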
"id": "llama-3.1-8b",
"name": "Llama 3.1 8B",
"cost_per_1m_in": 0.1,
"cost_per_1m_out": 0.1,
"cost_per_1m_in_cached": 0,
"cost_per_1m_out_cached": 0,
"context_window": 128000,
"default_max_tokens": 4096,
"can_reason": true,
Copilot AI, Nov 24, 2025
All newly added Llama 3.1 models (8B, 70B, 405B) have 'can_reason' set to true, but the existing Llama 3.2 3B and Llama 3.3 70B models have 'can_reason' set to false. The reason for this inconsistency is unclear. Verify that the reasoning capability designation is correct across all Llama model versions, since Llama 3.1 and 3.2 are similar model families.
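One way to surface this kind of inconsistency automatically is to group models by family and report any family whose 'can_reason' values disagree. The sketch below is illustrative only: the ids other than llama-3.1-8b and llama-3.2-3b are assumed spellings, and treating the text before the first hyphen as the family is a simplifying assumption.

```python
from collections import defaultdict

# Flags as described in the review comment; ids marked "assumed" are guesses
# at the naming scheme and do not come from the diff.
models = [
    {"id": "llama-3.1-8b", "can_reason": True},
    {"id": "llama-3.1-70b", "can_reason": True},    # id assumed
    {"id": "llama-3.1-405b", "can_reason": True},   # id assumed
    {"id": "llama-3.2-3b", "can_reason": False},
    {"id": "llama-3.3-70b", "can_reason": False},   # id assumed
]


def family(model_id: str) -> str:
    """Treat everything before the first hyphen as the model family."""
    return model_id.split("-", 1)[0]


def mixed_flag_families(models, flag="can_reason"):
    """Return the sorted families whose models disagree on the given flag."""
    values_by_family = defaultdict(set)
    for model in models:
        values_by_family[family(model["id"])].add(bool(model[flag]))
    return sorted(name for name, values in values_by_family.items() if len(values) > 1)


print(mixed_flag_families(models))  # ['llama']
```

A mixed family is not necessarily wrong, since capabilities can differ between versions, but it marks the entries worth verifying against the provider's documentation.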
Summary
Expands Venice.ai model selection from 5 to 13 models, providing more options for different use cases.
Changes
Motivation
Venice.ai offers a wide range of models beyond the initial 5 included in Catwalk. This PR adds 8 additional popular models that are commonly used for coding, reasoning, and cost-effective inference.
Testing
Tested with VeniceCode (Venice.ai-optimized fork of Crush) to ensure all models work correctly with the OpenAI-compatible API.
Related